53 research outputs found

    Automaton Meets Query Algebra: Towards a Unified Model for XQuery Evaluation over XML Data Streams

    Abstract. In this work, we address the efficient evaluation of XQuery expressions over continuous XML data streams, which is essential for a broad range of applications including monitoring systems and information dissemination systems. While previous work has shown that automata theory is suited for on-the-fly pattern retrieval over XML data streams, we find that automata-based approaches suffer from being not as flexibly optimizable as algebraic query systems. In fact, they enforce a rigid data-driven paradigm of execution. We thus now propose a unified query model to augment automata-style processing with algebra-based query optimization techniques. The proposed model has been successfully applied in the Raindrop stream processing system. Our experimental study confirms considerable performance gains with both established optimization techniques and our novel query rewrite rules.

    Satisfiability of constraint specifications on XML documents

    Jose Meseguer is one of the earliest contributors in the area of Algebraic Specification. In this paper, which we are happy to dedicate to him on the occasion of his 65th birthday, we use ideas and methods coming from that area with the aim of presenting an approach for the specification of the structure of classes of XML documents and for reasoning about them. More precisely, we specify the structure of documents using sets of constraints that are based on XPath and we present inference rules that are shown to define a sound and complete refutation procedure for checking satisfiability of a given specification using tableaux. Peer reviewed. Postprint (author's final draft).

    Earliest Query Answering for Deterministic Nested Word Automata

    Earliest query answering (EQA) is an objective of many recent streaming algorithms for XML query answering that aim for close to optimal memory management. In this paper, we show that EQA is infeasible even for a small fragment of Forward XPath except if P=NP. We then present an EQA algorithm for queries and schemas defined by deterministic nested word automata (dNWAs) and distinguish a large class of dNWAs for which streaming query answering is feasible in polynomial space and time.

    Schemas for Unordered XML on a DIME

    We investigate schema languages for unordered XML having no relative order among siblings. First, we propose unordered regular expressions (UREs), essentially regular expressions with unordered concatenation instead of standard concatenation, that define languages of unordered words to model the allowed content of a node (i.e., collections of the labels of children). However, unrestricted UREs are computationally too expensive as we show the intractability of two fundamental decision problems for UREs: membership of an unordered word to the language of a URE and containment of two UREs. Consequently, we propose a practical and tractable restriction of UREs, disjunctive interval multiplicity expressions (DIMEs). Next, we employ DIMEs to define languages of unordered trees and propose two schema languages: disjunctive interval multiplicity schema (DIMS), and its restriction, disjunction-free interval multiplicity schema (IMS). We study the complexity of the following static analysis problems: schema satisfiability, membership of a tree to the language of a schema, schema containment, as well as twig query satisfiability, implication, and containment in the presence of schema. Finally, we study the expressive power of the proposed schema languages and compare them with yardstick languages of unordered trees (FO, MSO, and Presburger constraints) and DTDs under commutative closure. Our results show that the proposed schema languages are capable of expressing many practical languages of unordered trees and enjoy desirable computational properties. Comment: Theory of Computing System

    Combining Temporal Logics for Querying XML Documents

    Abstract. Close relationships between XML navigation and temporal logics have been discovered recently, in particular between logics LTL and CTL* and XPath navigation, and between the µ-calculus and navigation based on regular expressions. This opened up the possibility of bringing model-checking techniques into the field of XML, as documents are naturally represented as labeled transition systems. Most known results of this kind, however, are limited to Boolean or unary queries, which are not always sufficient for complex querying tasks. Here we present a technique for combining temporal logics to capture n-ary XML queries expressible in two yardstick languages: FO and MSO. We show that by adding simple terms to the language, and combining a temporal logic for words together with a temporal logic for unary tree queries, one obtains logics that select arbitrary tuples of elements, and can thus be used as building blocks in complex query languages. We present general results on the expressiveness of such temporal logics, study their model-checking properties, and relate them to some common XML querying tasks.

    Implementing a tamper-evident database system

    Abstract. Data integrity is an assurance that data has not been modified in an unknown or unauthorized manner. The goal of this paper is to allow a user to leverage a small amount of trusted client-side computation to achieve guarantees of integrity when interacting with a vulnerable or untrusted database server. To achieve this goal we describe a novel relational hash tree, designed for efficient database processing, and evaluate the performance penalty for integrity guarantees. We show that strong cryptographic guarantees of integrity can be provided in a relational database with modest overhead.

    A formal analysis of information disclosure in data exchange

    Abstract. We perform a theoretical study of the following query-view security problem: given a view V to be published, does V logically disclose information about a confidential query S? The problem is motivated by the need to manage the risk of unintended information disclosure in today's world of universal data exchange. We present a novel information-theoretic standard for query-view security. This criterion can be used to provide a precise analysis of information disclosure for a host of data exchange scenarios, including multi-party collusion and the use of outside knowledge by an adversary trying to learn privileged facts about the database. We prove a number of theoretical results for deciding security according to this standard. We also generalize our security criterion to account for prior knowledge a user or adversary may possess, and introduce techniques for measuring the magnitude of partial disclosures. We believe these results can be a foundation for practical efforts to secure data exchange frameworks, and also illuminate a nice interaction between logic and probability theory.

    Homomorphism Resolving of XPath Trees Based on Automata
